Arabic WebPages Classification Based On Fuzzy Association
Authors
Abstract
Retrieving information from web documents has become a challenge owing to the exponential growth in the number of web pages on the Internet and the rapid changes in those pages. It is therefore necessary to classify web pages into classes whose results can be supplied to downstream tools, which makes information retrieval easier and facilitates the use of applications on the Internet. Web page classification can be applied to the various types of data available on the Internet, such as text, images, audio, and video, and each type requires different algorithms for classification and processing. Classifying Arabic web pages while taking their structure into account is more difficult. This paper discusses the results of classifying Arabic web pages obtained using a fuzzy algorithm.
Similar resources
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
Arabic Document Summarization Using Fa Fuzzy Ontology
Ontology is the basis of sharing and reusing knowledge on the semantic web. The fuzzy ontology is an extension of the domain ontology for solving uncertainty problems. Although many earlier methods tried to create a fuzzy ontology and applied it to document summarization, none of the previous methods have discussed Arabic document summarization using FA fuzzy ontology which could benefit f...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
Document Analysis And Classification Based On Passing Window
In this paper we present a document analysis and classification system to segment and classify the contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing stage by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...
Fuzzy Equivalence Relation Based Clustering and Its Use to Restructuring Websites' Hyperlinks and Web Pages
Quality design of websites implies that, among other factors, the hyperlinks' structure should allow users to reach the information they seek with the minimum number of clicks. This paper utilises the fuzzy equivalence relation based clustering in adapting website hyperlinks' structure so that the redesigned website allows users to meet as effectively as possible their informational and navigatio...